{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format = 'retina'\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import YouTubeVideo\n",
    "\n",
    "YouTubeVideo(id=\"3DWSRCbPPJs\", width=\"100%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you remember, at the beginning of this book,\n",
    "we saw a quote from John Quackenbush that essentially said\n",
    "that the reason a graph is interesting is because of its edges.\n",
    "In this chapter, we'll see this in action once again,\n",
    "as we are going to figure out how to leverage the edges\n",
    "to find special _structures_ in a graph.\n",
    "\n",
    "## Triangles\n",
    "\n",
    "The first structure that we are going to learn about is **triangles**.\n",
    "Triangles are super interesting!\n",
    "They are what one might consider to be\n",
    "\"the simplest complex structure\" in a graph.\n",
    "Triangles can also have semantically-rich meaning depending on the application.\n",
    "To borrow a bad example, love triangles in social networks are generally frowned upon,\n",
    "while on the other hand, when we connect two people that we know together,\n",
    "we instead _complete_ a triangle.\n",
    "\n",
    "### Load Data\n",
    "\n",
    "To learn about triangles,\n",
    "we are going to leverage a physician trust network.\n",
    "Here's the data description:\n",
    "\n",
    "> This directed network captures innovation spread among 246 physicians \n",
    "> for towns in Illinois, Peoria, Bloomington, Quincy and Galesburg.\n",
    "> The data was collected in 1966.\n",
    "> A node represents a physician and an edge between two physicians\n",
    "> shows that the left physician told that the right physician is his friend\n",
    "> or that he turns to the right physician if he needs advice\n",
    "> or is interested in a discussion.\n",
    "> There always only exists one edge between two nodes\n",
    "> even if more than one of the listed conditions are true."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nams import load_data as cf\n",
    "G = cf.load_physicians_network()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Finding triangles in a graph\n",
    "\n",
    "This exercise is going to flex your ability\n",
    "to \"think on a graph\", just as you did in the previous chapters.\n",
    "\n",
    "> Leveraging what you know, can you think of a few strategies\n",
    "> to find triangles in a graph?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nams.solutions.structures import triangle_finding_strategies\n",
    "\n",
    "# triangle_finding_strategies()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Identify whether a node is in a triangle relationship or not\n",
    "\n",
    "Let's now get down to implementing this next piece of code.\n",
    "\n",
    "> Write a function that identifies whether a node is or is not in a triangle relationship.\n",
    "> It should take in a graph `G` and a node `n`,\n",
    "> and return a boolean True if the node `n` is in any triangle relationship\n",
    "> and boolean False if the node `n` is not in any triangle relationship.\n",
    "\n",
    "A hint that may help you:\n",
    "\n",
    "> Every graph object `G` has a `G.has_edge(n1, n2)` method that you can use to identify whether a graph has an edge between `n1` and `n2`.\n",
    "\n",
    "Also:\n",
    "\n",
    "> `itertools.combinations` lets you iterate over every _K-combination_ of items in an iterable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def in_triangle(G, node):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import in_triangle\n",
    "\n",
    "# UNCOMMENT THE NEXT LINE TO SEE MY ANSWER\n",
    "# in_triangle??"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, test your implementation below!\n",
    "The code cell will not error out if your answer is correct."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from random import sample\n",
    "import networkx as nx\n",
    "\n",
    "def test_in_triangle():\n",
    "    nodes = sample(list(G.nodes()), 10)\n",
    "    for node in nodes:\n",
    "        assert in_triangle(G, 3) == bool(nx.triangles(G, 3))\n",
    "\n",
    "test_in_triangle()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see from the test function above,\n",
    "NetworkX provides an `nx.triangles(G, node)` function.\n",
    "It returns the number of triangles that a node is involved in.\n",
    "We convert it to boolean as a hack to check whether or not\n",
    "a node is involved in a triangle relationship\n",
    "because 0 is equivalent to boolean `False`,\n",
    "while any non-zero number is equivalent to boolean `True`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Extract triangles for plotting\n",
    "\n",
    "We're going to leverage another piece of knowledge that you already have:\n",
    "the ability to extract subgraphs.\n",
    "We'll be plotting all of the triangles that a node is involved in.\n",
    "\n",
    "> Given a node, write a function that extracts out\n",
    "> all of the neighbors that it is in a triangle relationship with.\n",
    "> Then, in a new function,\n",
    "> implement code that plots only the subgraph\n",
    "> that contains those nodes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_triangle_neighbors(G, n):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import get_triangle_neighbors\n",
    "\n",
    "# UNCOMMENT THE NEXT LINE TO SEE MY ANSWER\n",
    "# get_triangle_neighbors??"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_triangle_relations(G, n):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import plot_triangle_relations\n",
    "\n",
    "plot_triangle_relations(G, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Triadic Closure\n",
    "\n",
    "In professional circles, making connections between two people\n",
    "is one of the most valuable things you can do professionally.\n",
    "What you do in that moment is what we would call\n",
    "**triadic closure**.\n",
    "Algorithmically, we can do the same thing\n",
    "if we maintain a graph of connections!\n",
    "\n",
    "Essentially, what we are looking for\n",
    "are \"open\" or \"unfinished\" triangles\".\n",
    "\n",
    "In this section, we'll try our hand at implementing\n",
    "a rudimentary triadic closure system.\n",
    "\n",
    "### Exercise: Design the algorithm\n",
    "\n",
    "> What graph logic would you use to identify triadic closure opportunities?\n",
    "> Try writing out your general strategy, or discuss it with someone."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nams.solutions.structures import triadic_closure_algorithm\n",
    "\n",
    "# UNCOMMENT FOR MY ANSWER\n",
    "# triadic_closure_algorithm()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Implement triadic closure.\n",
    "\n",
    "Now, try your hand at implementing triadic closure.\n",
    "\n",
    "> Write a function that takes in a graph `G` and a node `n`,\n",
    "> and returns all of the neighbors that are potential triadic closures\n",
    "> with `n` being the center node.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_open_triangles_neighbors(G, n):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import get_open_triangles_neighbors\n",
    "\n",
    "# UNCOMMENT THE NEXT LINE TO SEE MY ANSWER\n",
    "# get_open_triangles_neighbors??"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Plot the open triangles\n",
    "\n",
    "> Now, write a function that takes in a graph `G` and a node `n`,\n",
    "> and plots out that node `n` and all of the neighbors\n",
    "> that it could help close triangles with.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_open_triangle_relations(G, n):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import plot_open_triangle_relations\n",
    "\n",
    "plot_open_triangle_relations(G, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cliques\n",
    "\n",
    "Triangles are interesting in a graph theoretic setting\n",
    "because triangles are the simplest complex clique that exist.\n",
    "\n",
    "But wait!\n",
    "What is the definition of a \"clique\"?\n",
    "\n",
    "> A \"clique\" is a set of nodes in a graph\n",
    "> that are fully connected with one another\n",
    "> by edges between them.\n",
    "\n",
    "### Exercise: Simplest cliques\n",
    "\n",
    "Given this definition, what is the simplest \"clique\" possible?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nams.solutions.structures import simplest_clique\n",
    "\n",
    "# UNCOMMENT THE NEXT LINE TO SEE MY ANSWER\n",
    "# simplest_clique()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### $k$-Cliques\n",
    "\n",
    "Cliques are identified by their size $k$,\n",
    "which is the number of nodes that are present in the clique.\n",
    "\n",
    "A triangle is what we would consider to be a $k$-clique where $k=3$.\n",
    "\n",
    "A square with cross-diagonal connections is what we would consider to be\n",
    "a $k$-clique where $k=4$.\n",
    "\n",
    "By now, you should get the gist of the idea.\n",
    "\n",
    "### Maximal Cliques\n",
    "\n",
    "Related to this idea of a $k$-clique is another idea called \"maximal cliques\".\n",
    "\n",
    "Maximal cliques are defined as follows:\n",
    "\n",
    "> A maximal clique is a subgraph of nodes in a graph\n",
    "> \n",
    "> 1. to which no other node can be added to it and \n",
    "> 2. still remain a clique.\n",
    "\n",
    "NetworkX provides a way to find all maximal cliques:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# I have truncated the output to the first 5 maximal cliques.\n",
    "list(nx.find_cliques(G))[0:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: finding sized-$k$ maximal cliques\n",
    "\n",
    "> Write a generator function that yields all maximal cliques of size $k$.\n",
    "\n",
    "I'm requesting a generator as a matter of good practice;\n",
    "you never know when the list you return might explode in memory consumption,\n",
    "so generators are a cheap and easy way to reduce memory usage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def size_k_maximal_cliques(G, k):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import size_k_maximal_cliques"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, test your implementation against the test function below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_size_k_maximal_cliques(G, k):\n",
    "    clique_generator = size_k_maximal_cliques(G, k)\n",
    "    for clique in clique_generator:\n",
    "        assert len(clique) == k\n",
    "\n",
    "test_size_k_maximal_cliques(G, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clique Decomposition\n",
    "\n",
    "One _super_ neat property of cliques\n",
    "is that every clique of size $k$\n",
    "can be decomposed to the set of cliques of size $k-1$.\n",
    "\n",
    "Does this make sense to you?\n",
    "If not, think about triangles (3-cliques).\n",
    "They can be decomposed to three edges (2-cliques).\n",
    "\n",
    "Think again about 4-cliques.\n",
    "Housed within 4-cliques are four 3-cliques.\n",
    "_Draw it out if you're still not convinced!_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: finding all $k$-cliques in a graph\n",
    "\n",
    "> Knowing this property of $k$-cliques,\n",
    "> write a generator function that yields all $k$-cliques in a graph,\n",
    "> leveraging the `nx.find_cliques(G)` function.\n",
    "\n",
    "Some hints to help you along:\n",
    "\n",
    "> If a $k$-clique can be decomposed to its $k-1$ cliques,\n",
    "> it follows that the $k-1$ cliques can be decomposed into $k-2$ cliques,\n",
    "> and so on until you hit 2-cliques.\n",
    "> This implies that all cliques of size $k$\n",
    "> house cliques of size $n < k$, where $n >= 2$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def find_k_cliques(G, k):\n",
    "    # your answer here\n",
    "    pass\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import find_k_cliques\n",
    "\n",
    "def test_find_k_cliques(G, k):\n",
    "    for clique in find_k_cliques(G, k):\n",
    "        assert len(clique) == k\n",
    "\n",
    "test_find_k_cliques(G, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Connected Components\n",
    "\n",
    "Now that we've explored a lot around cliques,\n",
    "we're now going to explore this idea of \"connected components\".\n",
    "To do so, I am going to have you draw the graph\n",
    "that we are working with.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import nxviz as nv\n",
    "\n",
    "nv.circos(G)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Visual insights\n",
    "\n",
    "From this rendering of the CircosPlot,\n",
    "what visual insights do you have about the structure of the graph?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from nams.solutions.structures import visual_insights\n",
    "\n",
    "# UNCOMMENT TO SEE MY ANSWER\n",
    "# visual_insights()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Defining connected components\n",
    "\n",
    "From [Wikipedia](https://en.wikipedia.org/wiki/Connected_component_%28graph_theory%29):\n",
    "\n",
    "> In graph theory, a connected component (or just component) of an undirected graph is a subgraph in which any two vertices are connected to each other by paths, and which is connected to no additional vertices in the supergraph.\n",
    "\n",
    "NetworkX provides a function to let us find all of the connected components:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "ccsubgraph_nodes = list(nx.connected_components(G))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see how many connected component subgraphs are present:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "len(ccsubgraph_nodes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: visualizing connected component subgraphs\n",
    "\n",
    "In this exercise, we're going to draw a circos plot of the graph, \n",
    "but colour and order the nodes by their connected component subgraph.\n",
    "\n",
    "Recall Circos API:\n",
    "\n",
    "```python\n",
    "c = CircosPlot(G, node_order='node_attribute', node_color='node_attribute')\n",
    "c.draw()\n",
    "plt.show()  # or plt.savefig(...)\n",
    "```\n",
    "\n",
    "Follow the steps along here to accomplish this.\n",
    "\n",
    "> Firstly, label the nodes with a unique identifier for connected component subgraph\n",
    "> that it resides in.\n",
    "> Use `subgraph` to store this piece of metadata."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def label_connected_component_subgraphs(G):\n",
    "    # Your answer here\n",
    "    return G\n",
    "\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import label_connected_component_subgraphs\n",
    "G_labelled = label_connected_component_subgraphs(G)\n",
    "\n",
    "# UNCOMMENT TO SEE THE ANSWER\n",
    "# label_connected_component_subgraphs??"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Now, draw a CircosPlot with the node order and colouring\n",
    "> dictated by the `subgraph` key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def plot_cc_subgraph(G):\n",
    "    # Your answer here\n",
    "    pass\n",
    "\n",
    "\n",
    "# COMMENT OUT THE IMPORT LINE TO TEST YOUR ANSWER\n",
    "from nams.solutions.structures import plot_cc_subgraph\n",
    "from nxviz import annotate\n",
    "\n",
    "plot_cc_subgraph(G_labelled)\n",
    "annotate.circos_group(G_labelled, group_by=\"subgraph\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using an arc plot will also clearly illuminate for us\n",
    "that there are no inter-group connections."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "nv.arc(G_labelled, group_by=\"subgraph\", node_color_by=\"subgraph\")\n",
    "annotate.arc_group(G_labelled, group_by=\"subgraph\", rotation=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_Voila!_ It looks quite clear that there are indeed four disjoint group of physicians."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Solutions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from nams.solutions import structures\n",
    "import inspect\n",
    "\n",
    "print(inspect.getsource(structures))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "nams",
   "language": "python",
   "name": "nams"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
